- Path: gryphon.phoenix.net!usenet
- From: brucew@phoenix.net (Bruce Wedding)
- Newsgroups: comp.lang.c
- Subject: Re: Count lines in file?
- Date: Mon, 01 Jan 1996 06:40:34 GMT
- Organization: BranPaul Systems
- Message-ID: <4c7t8p$p5l@gryphon.phoenix.net>
- References: <4bfnqu$btj@news1.netzone.com> <4bi4nr$3qe@castle.nando.net> <4bk30e$l08@news1.netzone.com> <Pine.A32.3.91.951224222853.23094E@red.weeg.uiowa.edu> <4bm9tk$h5a@crl14.crl.com> <820334651snz@genesis.demon.co.uk> <4c4v8b$kb6@crl5.crl.com>
- NNTP-Posting-Host: dial22.phoenix.net
- X-Newsreader: Moe's Newsreader
-
- kossick@crl.com (Paul J. Kossick) wrote:
-
-
- >That's an interesting point I thought of when talking to someone else
- >about this via E-Mail...I would think that fgets would be SLOWER than
- >getc, since it actually has to execute code similar to getc to fill the
- >buffer in the first place...as well as other work such as checking to see
- >if the buffer is full. Since you don't NEED to retain the lines intact
- >in order to count them, a buffer is basically surplusage.
- >
- >Oh, before someone else points it out: One problem with getc might be if
- >the last line of the program doesn't have a terminating newline
- >character, i.e. it terminates with the eof. This could cause the count
- >to be off by one in such cases. fgets avoids this by counting the line
- >anyways (though there's still that full-buffer problem), and I suppose
- >some code could be devised to make getc deal with it.
-
- You were right about this. I just let it go though.
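For reference, both gaps mentioned in the quoted text could be closed along these lines. This is only a sketch, not the code timed in this post: the names count_lines_getc and count_lines_fgets are made up for illustration, and it assumes a text file with no embedded NUL characters (strlen would stop short on those).

```c
#include <stdio.h>
#include <string.h>

/* Count lines one character at a time.  A final line that ends at
   EOF without a newline is still counted. */
long count_lines_getc(FILE *fp)
{
    long count = 0;
    int c;
    int last = '\n';    /* an empty file counts as zero lines */

    while ((c = getc(fp)) != EOF) {
        if (c == '\n')
            ++count;
        last = c;
    }
    if (last != '\n')   /* unterminated last line */
        ++count;
    return count;
}

/* Count lines with fgets.  A line longer than the buffer arrives in
   several pieces, so only a piece ending in '\n' completes a line. */
long count_lines_fgets(FILE *fp)
{
    char buf[256];
    long count = 0;
    int in_line = 0;    /* nonzero while a line is only partly read */

    while (fgets(buf, sizeof buf, fp) != NULL) {
        size_t len = strlen(buf);

        if (len > 0 && buf[len - 1] == '\n') {
            ++count;
            in_line = 0;
        } else {
            in_line = 1;
        }
    }
    if (in_line)        /* unterminated last line */
        ++count;
    return count;
}
```

With these, a file containing "a\nb\nc" counts as three lines either way, and a 600 character line counts once in the fgets version. The extra test per character or per call adds a little work, so these are not a drop-in replacement for a timing comparison.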
-
- >Anyone want to run these as an experiment, and report the results to the
- >group? I'd do it, but why should I hog all the fun? ;)
-
- Here is the code I ran. I've determined that the results are
- completely inconclusive and probably not only compiler dependent, but
- compiler switch dependent also. I ran it on a 6000 line file,
- compiled with MSC 8.00c on a DOS box. With the default command line,
- the fgets() version was usually about 50-100% faster. With the
- optimizer set to "fastest code" ( /O2 for MSC), the getc() version was
- marginally faster or equal. I'd say it really doesn't make a
- difference which to use.
-
- Here is the code:
-
- #include <stdio.h>
- #include <stdlib.h>
- #include <time.h>
-
- long int char_by_char( FILE *fp);
- long int line_by_line( FILE *fp);
-
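- /* Count newline characters one at a time.  A last line that ends
-    at EOF without a newline is not counted. */
- long int char_by_char( FILE *fp)
- {
-     long int count = 0;
-     int c;
-
-     while ((c = getc(fp)) != EOF)
-     {
-         if (c == '\n')
-             ++count;
-     }
-     return count;
- }
-
-
- /* Count successful fgets() calls.  A line longer than the buffer
-    is counted once per piece read. */
- long int line_by_line( FILE *fp)
- {
-     char buf[256];
-     long int count = 0;
-
-     while ( fgets(buf, sizeof buf, fp))
-     {
-         ++count;
-     }
-     return count;
- }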
-
- int main (int argc, char *argv[])
- {
-     FILE *fp;
-
-     clock_t start = 0, end = 0;  /* clock() returns clock_t, not time_t */
-     long int nl = 0;
-
-     if (argc < 2)
-     {
-         printf("Missing file name argument\n");
-         exit(1);
-     }
-
-     fp = fopen(argv[1],"rb");
-     if (fp == NULL)
-     {
-         printf("Cannot open %s\n", argv[1]);
-         exit(1);
-     }
-
-     /* Each method is timed twice, alternating, on the same file. */
-     start = clock();
-     nl = char_by_char( fp);
-     end = clock();
-     printf("Using getc(): ET: %ld \t %ld Lines\n",
-            (long)(end - start), nl);
-
-     nl = 0;
-     rewind(fp);
-     start = clock();
-     nl = line_by_line( fp );
-     end = clock();
-     printf("Using fgets(): ET: %ld \t %ld Lines\n",
-            (long)(end - start), nl);
-
-     nl = 0;
-     rewind(fp);
-     start = clock();
-     nl = char_by_char( fp);
-     end = clock();
-     printf("Using getc(): ET: %ld \t %ld Lines\n",
-            (long)(end - start), nl);
-
-     nl = 0;
-     rewind(fp);
-     start = clock();
-     nl = line_by_line( fp );
-     end = clock();
-     printf("Using fgets(): ET: %ld \t %ld Lines\n",
-            (long)(end - start), nl);
-
-     fclose(fp);
-     return 0;
- }
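
The ET figures printed by this program are raw clock() ticks, and one pass over a 6000 line file may amount to only a few of them. One way to get steadier numbers, assuming it is acceptable to repeat each pass and average, is sketched below. The names seconds_per_pass and count_newlines are hypothetical, and the caller must pass a positive number of passes. Since clock() is specified to report processor time, the result may not reflect time spent waiting on the disk.

```c
#include <stdio.h>
#include <time.h>

/* Same idea as the getc() counter in the post: count '\n' characters. */
long count_newlines(FILE *fp)
{
    long count = 0;
    int c;

    while ((c = getc(fp)) != EOF) {
        if (c == '\n')
            ++count;
    }
    return count;
}

/* Run `counter` over the whole file `passes` times (passes > 0) and
   return the average seconds per pass.  The line count from the last
   pass is stored through `lines_out` when it is not NULL. */
double seconds_per_pass(FILE *fp, long (*counter)(FILE *),
                        int passes, long *lines_out)
{
    clock_t start, end;
    long lines = 0;
    int i;

    start = clock();
    for (i = 0; i < passes; ++i) {
        rewind(fp);             /* start every pass at the beginning */
        lines = counter(fp);
    }
    end = clock();

    if (lines_out != NULL)
        *lines_out = lines;

    /* CLOCKS_PER_SEC converts clock() ticks to seconds. */
    return (double)(end - start) / CLOCKS_PER_SEC / passes;
}
```

A call such as seconds_per_pass(fp, count_newlines, 100, &nl) would time 100 passes of one counter; passing a different function with the same signature times the other method under the same conditions.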
-
- Bruce D. Wedding Have Compiler, Will Travel!
- Perspicacious Programming Performed Promptly
- Katy, Texas, USA, Planet Earth, Milkyway Galaxy, Known Universe
-
-